Self-templated nucleation in peptide and protein aggregation 
Peptides and proteins exhibit a common tendency to assemble into highly ordered fibrillar aggregates,
whose formation proceeds in a nucleation-dependent manner that is often preceded by
the formation of disordered oligomeric assemblies. This process has received much attention be- 
cause disordered oligomeric aggregates have been associated with neurodegenerative disorders such 
as Alzheimer's and Parkinson's diseases. Here we describe a self-templated nucleation mechanism 
that determines the transition between the initial condensation of polypeptide chains into disordered 
assemblies and their reordering into fibrillar structures. The results that we present show that at the 
molecular level this transition is due to the ability of polypeptide chains to reorder within oligomers 
into fibrillar assemblies whose surfaces act as templates that stabilise the disordered assemblies. 



There are two fundamental questions that one can ask 
about the general phenomenon of formation of ordered 
structures by atoms and molecules. The first concerns 
the type of assembly that the given particles can form, 
and the second the kinetic paths that the particles follow 
in order to reach a stable structure. The answer to the 
first question, at least in the cases when the particles are 
rigid, is that in nature there exists only a small number of 
possible crystal structures, corresponding to the 230 crystallographic
space groups [1]. The extraordinary power
of this result, which is based on symmetry and geome- 
try arguments, is that the answer is independent of the 
specific particles and of their mutual interactions. The 
specific particle properties merely determine which type 
of crystal structure is the most stable. The answer to the 
second question is generally more difficult, and involves 
a nucleation and growth mechanism [2]. In this mech- 
anism the atoms first need to come together to form a 
critical nucleus (nucleation phase), before they can grow 
(elongation phase). The probability $P_c$ that a spontaneous
fluctuation will result in the formation of a critical
nucleus depends exponentially on the free energy $\Delta F_c$
required to form such a nucleus: $P_c = \exp(-\Delta F_c / kT)$,
where T is the absolute temperature and k is the Boltzmann
constant. In atomic systems the activation barriers
are usually very high, and the probability of observing 
nuclei is very small; even when they form, their lifetime 
is fleetingly short, so that up to now there is no clear 
experimental observation of critical nuclei in atomic sys- 
tems, and computer simulations have become a major 
tool to investigate this phenomenon [3]. 
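
As a rough illustration of why critical nuclei are so rarely observed, the short sketch below evaluates the Boltzmann factor exp(-ΔF_c/kT) for a few barrier heights. The function name and the barrier values are hypothetical and chosen only to show the exponential sensitivity; they are not taken from this work.

```python
import math


def nucleation_probability(barrier_in_kT):
    """Boltzmann factor exp(-dF_c / kT) for a barrier given in units of kT."""
    return math.exp(-barrier_in_kT)


# Hypothetical barrier heights, chosen only to show the exponential sensitivity.
for barrier in (5.0, 20.0, 50.0):
    p_c = nucleation_probability(barrier)
    print(f"dF_c = {barrier:4.1f} kT  ->  P_c = {p_c:.2e}")
```

Each additional kT of barrier height lowers the probability by a factor of e, so moderate changes in the barrier translate into changes of many orders of magnitude in the likelihood of observing a nucleus.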

An additional problem arises when the particles form- 
ing the ordered structures are not rigid, but flexible, as is 
the case of peptides and proteins. Individual molecules 
of this type often have an intrinsic tendency to fold into
ordered structural patterns, which may either favour or 
hinder their intermolecular assembly process. This prob- 
lem constitutes an entirely new chapter in the study 
of ordering that has very great significance for biology 
and biotechnology. Indeed biomolecules such as DNA 
and proteins have recurrent structural motifs such as α-helices
and β-sheets [4], and a wide range of different
proteins can assemble into highly ordered fibrillar aggre- 
gates [5]. Although the amino acid sequences of these 
proteins are often unrelated, the structures of amyloid 
fibrils show a common characteristic cross-β structure in
which the main axis of individual molecules runs orthog- 
onal to the direction of the filaments. It has thus been 
suggested that the inherent ability to form fibrillar as- 
semblies is a feature common to polypeptide chains [6]. 

In order to explain why this process takes place de- 
spite the remarkable resistance of native states of pro- 
teins to aggregation, a "nucleated conformational con- 
version" mechanism has been proposed in which the for- 
mation of highly dynamic oligomeric assemblies facili- 
tates the further conversion of polypeptide chains into 
ordered fibrillar structures [7]. Evidence in favour of
this mechanism has been provided through experimental
[8, 9] and theoretical studies [10-16].
In this work we exploit the ability of computer simulations
to provide a description of molecular processes at
very high resolution, an ability that has proven invaluable
in defining nucleation processes in atomic and colloidal
systems [3, 17]. In the case of polypeptide chains
we have recently shown that it is possible to follow the 
aggregation process on a time long enough to include 
both the initial condensation into disordered oligomeric 
assemblies and their subsequent reorganisation into fibrillar
structures [18], and that the nucleated conformational
conversion mechanism can also be described as
a "condensation-reordering" mechanism (Fig. 1). Here
we investigate a self-templated nucleation mechanism, 
which determines the coupling between the assembly of 
polypeptide chains in disordered oligomers and the transformation
into highly ordered cross-β structures. A template
model by which misfolded prions induce the conversion
of nearby native prions to the misfolded state
has already been proposed [19], and described for SH3
[20] and ccβ [21]. In particular, in the latter case it was
shown that β-sheets with exposed hydrophobic surfaces
and unsaturated hydrogen bonds accelerate the conversion
from native α-helices to β-sheets.

FIG. 1: Condensation-reordering mechanism at c = 12.5 mM
and T* = 0.66, above the folding temperature. Peptides that
do not form interchain hydrogen bonds are shown in blue,
those forming interchain hydrogen bonds are assigned a random
colour. Peptides within the same β-sheet are assigned
the same colour. (a) Initially, at t < 1000, the peptides are in
a monomeric state (left panel). The progress variable t is the
number of Monte Carlo moves performed in the simulation,
and one unit of t is a block of 10^5 Monte Carlo moves. As
the simulation progresses, a hydrophobic collapse causes the
formation of a small disordered oligomer (t = 15000, right
panel). (b) Enlarged view of a disordered oligomer which
subsequently orders into a protofilament structure: t = 15000
(left), t = 19000 (middle), t = 30000 (right).

The computational strategy that we follow is based 
on the attempt to reproduce simultaneously two com- 
mon aspects of proteins, their ability to form secondary 
structural motifs [22], and their propensity to form ordered
fibrillar aggregates [5]. Recently it has been shown
that motifs such as α-helices, β-sheets and cross-β structures
are natural forms of a marginally compact phase of
matter characteristic of flexible polymers [23, 24]. The
description of proteins in terms of flexible tubes captures 
in a simple way the main symmetry of chain molecules. 
In this approach a polypeptide chain is represented by 
a Cα chain of finite thickness, which approximately envelops
the backbone atoms. The hydrophobic effects due
to the water are accounted for by a pairwise additive interaction
between different Cα atoms, with an energy $e_{HP}$,
when they are close. The sequence-independent definition
of hydrogen bonding is obtained from an analysis of
the geometric properties of hydrogen-bond-forming Cα
atoms in the Protein Data Bank, and each hydrogen bond
is assigned an energy $e_{HB}$. Steric constraints due to side chains are imposed
by a local bending stiffness with energy $e_S$ (for a
detailed description of the model see [18, 25]). Based
on the hypothesis that the formation of amyloid fibrils 
is a common feature of all polypeptide chains, which de- 
pends mainly on the generic properties of their backbone 
[26, 27], we investigated the behaviour of a representative
model system consisting of 80 weakly hydrophobic
12-residue homopolymers. Systems of homopolymers [28] 



have been shown experimentally to form amyloid assemblies.
In all our simulations the energy of hydrogen bonds
was set to $e_{HB} = -3kT_0$, a value close to experimental estimates
(1.5 kcal/mol at room temperature [29]). Here $kT_0$ is a
reference thermal energy and the reduced temperature is
$T^* = T/T_0$. The hydrophobic and stiffness energies are
set to $e_{HP} = -0.15kT_0$ and $e_S = 0.9kT_0$, respectively.
The ratio $e_{HB}/e_{HP} = 20$ is such that these interactions
provide similar contributions to the potential energy of
the oligomer. With this choice of parameters the peptides
form an α-helical native structure below the folding
temperature $T_f \approx 0.6$, and a random coil above.
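
The reduced-unit bookkeeping can be checked with a few lines of code. The sketch below is only an illustration: it takes the hydrophobic energy as -0.15 kT_0, the value consistent with the stated ratio of 20, and the variable and function names are hypothetical. It shows how the reduced temperature T* enters the Boltzmann weight of a single interaction.

```python
import math

# Interaction energies in units of the reference thermal energy kT_0.
E_HB = -3.0   # hydrogen bond
E_HP = -0.15  # hydrophobic contact (assumed from the stated ratio of 20)
E_S = 0.9     # local bending stiffness


def boltzmann_weight(energy, t_star):
    """exp(-E / kT) with E in units of kT_0 and T* = T / T_0."""
    return math.exp(-energy / t_star)


ratio = E_HB / E_HP
print(f"e_HB / e_HP = {ratio:.1f}")

# The two reduced temperatures used in the text, below and above folding.
for t_star in (0.51, 0.66):
    weight = boltzmann_weight(E_HB, t_star)
    print(f"T* = {t_star}: weight of one hydrogen bond = {weight:.1f}")
```

Lowering T* increases the statistical weight of each hydrogen bond, which is in line with ordered structures being favoured below the folding temperature.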

In order to illustrate the condensation-reordering transition
(Fig. 1) we set the peptide concentration to c =
12.5 mM and the reduced temperature to T* = 0.66, to
keep the lag time prior to aggregation very short. Low- 
ering the concentration, while keeping the temperature 
constant, results in a dramatic increase of the lag time. 
At c = 1 mM the peptides remain monomeric on the
timescale that we have been able to follow, although
the aggregated state is likely to be much more stable
than the monomeric state. In this concentration regime
(c = 1 mM to c = 12.5 mM) the monomeric state is
metastable with respect to the aggregated state, and the 
aggregation of polypeptide chains follows a nucleation 
mechanism [30]. Under such conditions we calculated 
the nucleation barriers associated with the condensation- 
reordering transition. 
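
To give a feeling for the length scales implied by these concentrations, the following sketch converts a molar concentration into the side length of a simulation box. It assumes, purely for illustration, a cubic box that always contains the 80 peptides of the model system; the box geometry is not stated in the text and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # particles per mole (exact SI value)


def cubic_box_side_nm(n_peptides, conc_mM):
    """Side of a cubic box (in nm) holding n_peptides at concentration conc_mM."""
    conc_mol_per_litre = conc_mM * 1e-3
    volume_litre = n_peptides / (conc_mol_per_litre * AVOGADRO)
    volume_nm3 = volume_litre * 1e24  # 1 L = 1e-3 m^3 = 1e24 nm^3
    return volume_nm3 ** (1.0 / 3.0)


for conc in (1.0, 12.5):
    side = cubic_box_side_nm(80, conc)
    print(f"c = {conc:4.1f} mM  ->  box side = {side:.1f} nm")
```

Under this assumption, diluting the system from 12.5 mM to 1 mM increases the volume available to each peptide by a factor of 12.5, which gives some intuition for why encounters between chains become much rarer.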

A prerequisite for such a calculation is the ability to 
describe quantitatively the formation of small oligomeric 
assemblies. By using a standard cluster criterion, i.e. any 
two peptides whose centre of mass distance is less than 
5 Å belong to the same oligomer, it is possible to define
the oligomer size n, that corresponds to the number of 
peptides within the oligomer. At the same time it is 
possible to measure the β-sheet content of the oligomer.
As the formation of β-sheets is driven by interchain hydrogen
bonding, we use m, the number of interchain hydrogen
bonds formed within an oligomer, as a structural
observable. In order to calculate the joint equilibrium 
probability P(n, m) for the formation of an oligomeric 
assembly consisting of n peptides with m interchain hydrogen
bonds we performed biased Monte Carlo simulations
[17, 18] in the canonical ensemble using crankshaft,
pivot, reptation, rotation, and translation moves. As a
biasing potential we included an additional parabolic energy
term, $W = a(m - m_0)^2$, in the energy function,
where a and $m_0$ are parameters that can be used to control
the range of m values sampled in the simulation. The
calculation of $P(n, m)$ was split into 28 independent simulations
for different $m_0$ values, and finally combined into
one by a multi-histogram technique. The corresponding
nucleation free-energy landscape, apart from an additive
constant, is given by $F(n, m) = -kT^* \ln[P(n, m)]$. A representative
calculation of such a free-energy landscape
obtained at c = 1.2 mM and T* = 0.51 (i.e. below $T_f$)
is shown in Fig. 2a. The succession of local minima in
this landscape reveals the central result of this work - 

FIG. 2: (a) Contour plot of the nucleation barrier $F(n, m)$ calculated at concentration c = 1.2 mM and reduced temperature
T* = 0.51, which is below the folding temperature. The black circles indicate the minima on the free energy surface, and
the black line indicates a possible path that connects them. The labels ($n_\alpha$, $n_\beta$) of the minima describe the structure of the
oligomer, where $n_\alpha$ and $n_\beta$ are, respectively, the number of peptides in an α-helical and a β-strand structure, and $n = n_\alpha + n_\beta$.
In the inset we show snapshots of the oligomers associated with the minima. (b) Nucleation barrier for β-sheet formation,
$F_1(m)$, as a function of the number of interchain hydrogen bonds m that are formed within the oligomer. (c) Nucleation barrier
$F_2(n, m_{max})$ for the formation of an oligomer of size n that can form at most $m_{max}$ interchain hydrogen bonds: $m_{max} = 10$
($n_\beta < 3$) (black line), $m_{max} = 20$ ($n_\beta < 4$) (red line), all m values included (green line).



the presence of a coupling between the oligomer size n 
and the number of hydrogen bonds m in the states of 
minimal free energy for the oligomers. To clarify the nature
of this coupling we analysed the structure of the
oligomers formed in the local minima of the free energy
landscape. As an example, we show in the inset
of Fig. 2a that the oligomer corresponding to the minimum
(n = 13, m = 68) consists of $n_\beta = 8$ peptides in
a β-sheet conformation and $n_\alpha = 5$ in an α-helical conformation.
Here we define two peptides to be part of a
β-sheet if they form more than four interchain hydrogen
bonds with each other. Since the peptides within the
oligomer are either in an α-helical or a β-strand conformation,
$n = n_\alpha + n_\beta$, we labelled each minimum by the pair
($n_\alpha$, $n_\beta$). Typical configurations of other oligomers are
also shown in the inset of Fig. 2a. We further analyse
this coupling in Fig. 3, where we plot, for the states corresponding
to the minima of the free energy, the number
$n_\alpha$ of peptides in an α-helical conformation within an
oligomer as a function of the number $n_\beta$ of peptides in a
β-sheet conformation. The essentially linear relationship
indicates that the probability of oligomeric assemblies increases
with the size of the ordered β-sheet structures.
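
The cluster criterion used to define the oligomer size n can be sketched as follows. This is a minimal illustration, not the implementation used for the simulations: it links any two peptides whose centres of mass lie closer than 5 Å and merges linked peptides with a simple union-find structure. Periodic boundary conditions are ignored, and the function name and the coordinates are hypothetical.

```python
from itertools import combinations


def oligomer_sizes(centres, cutoff=5.0):
    """Return oligomer sizes (largest first) from centre-of-mass coordinates.

    Two peptides belong to the same oligomer if their centres of mass are
    closer than `cutoff` (same length unit as the coordinates, here Angstrom).
    """
    n_peptides = len(centres)
    parent = list(range(n_peptides))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n_peptides), 2):
        dist2 = sum((a - b) ** 2 for a, b in zip(centres[i], centres[j]))
        if dist2 < cutoff ** 2:
            parent[find(i)] = find(j)  # merge the two clusters

    sizes = {}
    for i in range(n_peptides):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)


# Hypothetical centres of mass: a chain of three linked peptides and a monomer.
centres = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(oligomer_sizes(centres))
```

Note that membership is transitive: the first and third peptides are 8 Å apart, yet they belong to the same oligomer because both are within the cutoff of the second.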

To reveal the molecular basis of this coupling, we have
analysed the nucleation barrier for β-sheet formation independent
of the number of peptides in an α-helical conformation.
The average over n is achieved by the marginalisation
of $P(n, m)$ with respect to n, and its corresponding
free-energy profile is $F_1(m) = -kT^* \ln[\sum_n P(n, m)]$.
The calculations indicate that $F_1(m)$ is composed of a series
of component barriers, each separated by $\Delta m = 10$
(Fig. 2b). Since each pair of peptides considered here in a
β-sheet conformation can form at most ten interchain hydrogen
bonds with each other, each maximum of $F_1(m)$
corresponds to the addition of a new peptide to the existing
β-sheet structure within the oligomer. After the
first few interchain hydrogen bonds are formed, the free
energy decreases until an optimal number of hydrogen
bonds is formed, corresponding to a local minimum. The
elongation barrier for β-sheet nucleation corresponds to
the free energy needed to transform a peptide from its
α-helical conformation to the extended structure it has
in its β-sheet conformation. The elongation barrier is
a quantitative measure for the aggregation propensities
of proteins, and can be measured experimentally [27]. In
our calculation a dimer formed by two β-strands is always
unstable and disassembles soon after it forms; the critical
size of the β-sheet is a tetramer, at least in the range of
concentrations and temperatures that we considered. 
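
The projection from the joint distribution P(n, m) onto the hydrogen-bond coordinate m can be sketched in a few lines. The example below is only an illustration under stated assumptions: energies are expressed in units of kT_0, so that the prefactor kT* reduces to T*, and the toy distribution and the function names are hypothetical rather than simulation data.

```python
import math
from collections import defaultdict


def free_energy_landscape(P, t_star):
    """F(n, m) = -T* ln P(n, m), in units of kT_0, up to an additive constant."""
    return {state: -t_star * math.log(p) for state, p in P.items() if p > 0.0}


def marginal_free_energy_m(P, t_star):
    """F_1(m) = -T* ln sum_n P(n, m): marginalise over the oligomer size n."""
    summed = defaultdict(float)
    for (n, m), p in P.items():
        summed[m] += p
    return {m: -t_star * math.log(p) for m, p in sorted(summed.items()) if p > 0.0}


# Hypothetical toy distribution over (n, m); NOT data from the simulations.
P_toy = {(1, 0): 0.50, (2, 0): 0.10, (2, 10): 0.30, (3, 20): 0.10}

F = free_energy_landscape(P_toy, t_star=0.51)
F1 = marginal_free_energy_m(P_toy, t_star=0.51)
for m, value in F1.items():
    print(f"m = {m:2d}: F_1 = {value:.3f} kT_0")
```

In practice P(n, m) would come from the combined biased simulations; the marginalisation step itself is independent of how the histogram was obtained.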

Next we investigated how the nucleation barrier for
the formation of an oligomer depends on its internal
structure, i.e. whether a β-sheet is present or not. The
structural average over m is obtained by resorting to
the marginal distribution function $P(n, m \le m_{max}) = \sum_{m=0}^{m_{max}} P(n, m)$,
and the corresponding free-energy profile
is given by $F_2(n, m_{max}) = -kT^* \ln P(n, m \le m_{max})$.
Here $F_2(n, m_{max})$ is the nucleation barrier for forming
an oligomer of n peptides with at most $m_{max}$ interchain
hydrogen bonds. This upper limit to the number of interchain
hydrogen bonds is introduced in the projection
operation with the goal of unveiling the role of the β-sheet
structure formed within the oligomer in the dynamical
evolution of the aggregation process. If we do
not allow the formation of β-sheets consisting of more

FIG. 3: Correlations for the successive minima in the free
energy landscape (such as that shown in Fig. 2a) between the
number $n_\alpha$ of peptides in an α-helical conformation and the
number $n_\beta$ of peptides in a β-sheet conformation. (a) Correlations
obtained at T* = 0.51 and c = 0.64 mM (green line),
c = 1.2 mM (red line) and c = 2.9 mM (black line). (b) Typical
configurations of the oligomers associated with the minima
obtained at T* = 0.51 and c = 2.9 mM (black line in (a)).



than two peptides, by imposing $m_{max} = 10$, the free
energy for the formation of an oligomer increases monotonically
(Fig. 2c, black line). If we do allow the formation
of β-sheets consisting of up to three peptides, by
relaxing the constraint to $m_{max} = 20$, we observe that
after an initial increase, for n < 5, a local minimum is
present at $n_{min} = 10$ (Fig. 2c, red line). After that,
for $n > n_{min}$, the free energy increases again because α-helical
oligomers are not stable under these conditions.
The existence of a local minimum of this type has never
been observed in atomic or molecular systems, where the
nucleation barrier increases monotonically until a critical
size, and then decreases monotonically for larger sizes [2].
Inclusion of all m values sampled during the simulation in
the summation relieves all constraints on the number of
β-sheets formed within the oligomer, and we have found
that the position of the local minimum of $F_2$ moves to a
larger value, $n_{min} \approx 15$ (Fig. 2c, green line).
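
The constrained projection onto the oligomer size n can be illustrated in the same spirit. The sketch below sums P(n, m) over all m up to a chosen upper limit and converts the result into a free-energy profile. As before, energies are assumed to be in units of kT_0, and the toy distribution and function name are hypothetical.

```python
import math
from collections import defaultdict


def constrained_free_energy_n(P, t_star, m_max=None):
    """F_2(n, m_max) = -T* ln sum_{m <= m_max} P(n, m), in units of kT_0.

    With m_max=None every sampled m value is included in the sum.
    """
    summed = defaultdict(float)
    for (n, m), p in P.items():
        if m_max is None or m <= m_max:
            summed[n] += p
    return {n: -t_star * math.log(p) for n, p in sorted(summed.items()) if p > 0.0}


# Hypothetical toy distribution over (n, m); NOT data from the simulations.
P_toy = {(1, 0): 0.50, (2, 0): 0.10, (2, 10): 0.30, (3, 20): 0.10}

F2_limited = constrained_free_energy_n(P_toy, t_star=0.51, m_max=10)
F2_all = constrained_free_energy_n(P_toy, t_star=0.51)
print("m_max = 10:", F2_limited)
print("all m     :", F2_all)
```

Because relaxing the limit can only add non-negative terms to the sum, the unconstrained profile never lies above a constrained one at the same n. Oligomer sizes that are only reachable with extended β-sheets therefore appear, or are stabilised, as the limit is raised.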

These results provide a molecular description of the 
origin of the coupling between the nucleation events lead- 
ing to the formation of fibrillar structures. The ability 
of polypeptide chains to reorder within oligomers into 
a fibrillar structure stabilises oligomeric assemblies since 
the surface of a growing β-sheet acts as a substrate for
the attachment of other α-helical peptides. This self-templated
nucleation mechanism also exists for the aggregation
process above the folding temperature (see Fig. 1),
but the associated nucleation barriers are smaller. In
systems where proteins form complex structural motifs
in the monomeric phase, the activation barriers for β-sheet
aggregation are likely to be higher, and the templating
effect in the nucleation mechanism might be more
pronounced. A better understanding of this mechanism 



should lead to an increasing ability to modulate the 
growth of peptide and protein aggregates, and it should 
play an important role in the development of therapies 
for conditions such as Alzheimer's and Parkinson's dis- 
eases [31]. 